Synthesis of emotional speech using prosodically balanced VCV segments
Abstract
This paper describes a system that synthesizes emotional speech based on TD-PSOLA. The system has a database of VCV (vowel-consonant-vowel) segments for each of three emotions: anger, sadness, and joy. These segments carry the speech quality of the corresponding emotion. Each database contains four kinds of VCV segments that are prosodically balanced in the sense that their concatenation can generate any accent pattern of Japanese. The system also has a duration formula for each phoneme and each emotion, which estimates the length of that phoneme given its phonemic and linguistic context. For these purposes we collected a speech corpus for each emotion. Using these corpora, we derived a guideline for designing the VCV databases and performed a multiple regression analysis to derive the duration formulae. Seven utterances were produced for each emotion and presented to twelve listeners, who recognized the intended emotion with an average accuracy of 84%.
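The duration formulae mentioned in the abstract come from a multiple regression analysis. The abstract does not list the predictors or coefficients, so the following is only a minimal sketch of how such a formula could be fitted by ordinary least squares. The feature names (`is_accented`, `is_phrase_final`, `n_morae`), the helper functions, and the toy durations are all hypothetical and invented for illustration; they are not taken from the paper.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_duration_formula(features, durations_ms):
    """Least-squares fit of duration = b0 + b1*x1 + ... + bk*xk.

    Solves the normal equations (X^T X) b = X^T y, where X is the
    feature matrix with a leading column of ones for the intercept.
    """
    rows = [[1.0] + [float(v) for v in f] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, durations_ms)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_duration(coeffs, feature_row):
    """Evaluate the fitted formula for one phoneme context."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], feature_row))


# Hypothetical toy data for ONE phoneme in ONE emotion (not from the paper).
# Columns: is_accented, is_phrase_final, n_morae (morae in the phrase).
features = [
    [0, 0, 4],
    [1, 0, 4],
    [0, 1, 6],
    [1, 1, 6],
    [0, 0, 8],
    [1, 0, 8],
]
durations_ms = [52.0, 62.0, 73.0, 83.0, 44.0, 54.0]

coeffs = fit_duration_formula(features, durations_ms)
print("coefficients:", [round(c, 3) for c in coeffs])
print("predicted ms:", round(predict_duration(coeffs, [1, 1, 4]), 3))
```

In a system like the one described, one such fit would be run per phoneme and per emotion, so each combination gets its own set of coefficients.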
Similar resources
Design and evaluation of a speech synthesis model based on concatenation of prosodic-context-sensitive units
This paper describes the design and evaluation of prosodically-sensitive concatenative units for a Persian text-to-speech (TTS) synthesis system. The syllables used are prosodically conditioned in the sense that a single conventional syllable is stored as different versions taken directly from the different prosodic domains of the prosodically labeled, read sentences. The three levels of the Per...
A study on the natural-sounding Japanese phonetic word synthesis by using the VCV-balanced word database that consists of the words uttered forcibly in two types of pitch accent
In order to synthesize natural-sounding Japanese phonetic words, a novel VCV-concatenation synthesis with an advanced word database is proposed. The word database consists of VCV-balanced phonetic words which are uttered forcibly in type-0 and type-1 pitch accents. The advantage of using the advanced word database is that a variety of VCV-segments with the same phonetic chains and the different ...
Building of a Speech Corpus Optimised for Unit Selection TTS Synthesis
The paper deals with the process of designing a phonetically and prosodically rich speech corpus for unit selection speech synthesis. The attention is given mainly to the recording and verification stage of the process. In order to ensure as high quality and consistency of the recordings as possible, a special recording environment consisting of a recording session management and “pluggable” ch...
Design and evaluation of prosodically-sensitive concatenative units for a Korean TTS system
This paper describes the design and evaluation of prosodically-sensitive concatenative units for a Korean text-to-speech (TTS) synthesis system. The diphones used are prosodically conditioned in the sense that a single conventional diphone is stored as different versions taken directly from the different prosodic domains of the prosodically labeled, read sentences. The four levels of the Korean...
Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis
This paper describes the collection and analysis of a multistyle emotional speech corpus, accomplished to study the variations of some acoustical parameters. Specifically, three emotional styles were considered: happiness, sadness and anger. Speech data in a neutral style were also collected, and prosodic differences of each style with respect to this neutral baseline were quantified. According...
Publication date: 2001